robust classification
- North America > United States (0.14)
- Asia > China (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
This work addresses the problem of learning a ranking prediction function that optimizes (N)DCG. The authors propose a surrogate loss based on a non-convex upper bound of the DCG, inspired by robust classification losses. The difference from other existing non-convex upper bounds is that the authors introduce the non-convexity at the context level (over a whole query) rather than at the item-pair level (see [8]). The authors then present two applications of their algorithm with experimental studies: one learns a prediction model for a search-engine problem, the other learns a representation for collaborative filtering.
- Summary/Review (0.55)
- Research Report (0.55)
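The (N)DCG objective that the surrogate loss targets can be computed directly. The snippet below is a generic illustration of DCG/NDCG (exponential gain, log2 rank discount), not the authors' non-convex surrogate:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """Normalize by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Because NDCG depends on the sorted order of scores, it is piecewise constant in the model parameters, which is why learning-to-rank methods optimize smooth (here, non-convex) upper bounds instead.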
Robust Classification of Oral Cancer with Limited Training Data
Sonawane, Akshay Bhagwan, Swamikannan, Lena D., Tamil, Lakshman
Oral cancer ranks among the most prevalent cancers globally, with a particularly high mortality rate in regions lacking adequate healthcare access. Early diagnosis is crucial for reducing mortality; however, challenges persist due to limited oral health programs, inadequate infrastructure, and a shortage of healthcare practitioners. Conventional deep learning models, while promising, often rely on point estimates, leading to overconfidence and reduced reliability. Critically, these models require large datasets to mitigate overfitting and ensure generalizability, an unrealistic demand in settings with limited training data. To address these issues, we propose a hybrid model that combines a convolutional neural network (CNN) with Bayesian deep learning for oral cancer classification using small training sets. This approach employs variational inference to enhance reliability through uncertainty quantification. The model was trained on photographic color images captured by smartphones and evaluated on three distinct test datasets. The proposed method achieved 94% accuracy on a test dataset with a distribution similar to that of the training data, comparable to traditional CNN performance. Notably, on real-world photographic image data whose characteristics differ from the training dataset, the proposed model demonstrated superior generalizability, achieving 88% accuracy across diverse datasets compared to 72.94% for traditional CNNs, even with a smaller training set. Confidence analysis revealed that the model exhibits low uncertainty (high confidence) for correctly classified samples and high uncertainty (low confidence) for misclassified samples. These results underscore the effectiveness of Bayesian inference in data-scarce environments, enhancing early oral cancer diagnosis by improving model reliability and generalizability.
- Asia > India > Tamil Nadu > Chennai (0.04)
- North America > United States > Texas > Dallas County > Richardson (0.04)
- Asia > Southeast Asia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
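The variational-inference idea in the abstract above can be sketched in miniature: draw weights from an assumed Gaussian variational posterior, average the class probabilities over many stochastic forward passes, and use the entropy of the averaged prediction as the uncertainty estimate. The toy linear model and all names below are illustrative, not the authors' CNN:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_forward(x, w_mean, w_std):
    """One forward pass with weights sampled from a Gaussian posterior q(w)."""
    w = rng.normal(w_mean, w_std)                      # sample weights
    logits = x @ w
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)           # softmax

def predict_with_uncertainty(x, w_mean, w_std, n_samples=50):
    """Monte Carlo average over weight samples; predictive entropy = uncertainty."""
    probs = np.mean([stochastic_forward(x, w_mean, w_std)
                     for _ in range(n_samples)], axis=0)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    return probs, entropy
```

A wider posterior (larger `w_std`) yields more varied per-sample predictions, so the averaged prediction is less peaked and its entropy higher, which is the confidence behavior the abstract reports.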
Robust Classification under Noisy Labels: A Geometry-Aware Reliability Framework for Foundation Models
Bozkurt, Ecem, Ortega, Antonio
Foundation models (FMs) pretrained on large datasets have become fundamental for various downstream machine learning tasks, in particular in scenarios where obtaining perfectly labeled data is prohibitively expensive. In this paper, we assume an FM has to be fine-tuned with noisy data and present a two-stage framework to ensure robust classification in the presence of label noise without model retraining. Recent work has shown that simple k-nearest neighbor (kNN) approaches using an embedding derived from an FM can achieve good performance even in the presence of severe label noise. Our work is motivated by the fact that these methods make use of local geometry. In this paper, following a similar two-stage procedure (reliability estimation followed by reliability-weighted inference), we show that improved performance can be achieved by introducing geometry information. For a given instance, our proposed inference uses a local neighborhood of training data, obtained using the non-negative kernel (NNK) neighborhood construction. We propose several methods for reliability estimation that can rely less on distance and local neighborhood as the label noise increases. Our evaluation on CIFAR-10 and DermaMNIST shows that our methods improve robustness across various noise conditions, surpassing standard kNN approaches and recent adaptive-neighborhood baselines.
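The two-stage idea (reliability estimation, then reliability-weighted inference) can be illustrated with a plain distance-weighted kNN vote over FM embeddings. The NNK construction used in the paper is more involved; the Gaussian-kernel weights below are only a stand-in:

```python
import numpy as np

def knn_weighted_predict(x, train_x, train_y, k=5, bandwidth=1.0):
    """Stage 1: reliability weights from a Gaussian kernel on distance.
    Stage 2: weighted vote over each neighbor's (possibly noisy) label."""
    d = np.linalg.norm(train_x - x, axis=1)
    idx = np.argsort(d)[:k]                                 # k nearest neighbors
    w = np.exp(-(d[idx] ** 2) / (2 * bandwidth ** 2))       # reliability weights
    classes = np.unique(train_y)
    scores = np.array([w[train_y[idx] == c].sum() for c in classes])
    return classes[np.argmax(scores)]
```

Down-weighting distant neighbors means an isolated mislabeled point contributes little to the vote, which is the mechanism behind kNN's robustness to label noise.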
Reviews: On the Hardness of Robust Classification
However, it has not been made clear how these impossibility results have any impact from a practical point of view. I would have appreciated it if the authors had provided some algorithms that achieve robust learnability even in a slightly relaxed setting, e.g. at least a robust learning algorithm for monotone conjunctions for Thm. Line 200 -- unexpected occurrence of ","; Line 239 -- "a universal" repeated; Line 310 -- "given" should be "gave"; etc. 3. Seemingly incomplete results / confusing theorem statements: a) It is very confusing that the authors claim in the Introduction: "On the other hand, a more powerful learning algorithm that has access to membership queries can exactly learn monotone conjunctions and as a result can also robustly learn with respect to exact in the ball loss." I would have appreciated the result a lot more if it were proved for a more general concept class where sample complexity depends on some complexity measure (e.g. VC dimension) of C_n, etc. 4. Experiments: The paper does not provide any empirical studies, but I understand that this is rather out of the scope of the current results, since all the claims mostly concern the non-existence of robust-learning algorithms; still, the authors could at least have proposed some algorithms for monotone conjunctions (i.e. for Thm.
Robust Classification Under Sample Selection Bias
In many important machine learning applications, the source distribution used to estimate a probabilistic classifier differs from the target distribution on which the classifier will be used to make predictions. Due to its asymptotic properties, sample-reweighted loss minimization is a commonly employed technique to deal with this difference. However, given finite amounts of labeled source data, this technique suffers from significant estimation errors in settings with large sample selection bias. We develop a framework for robustly learning a probabilistic classifier that adapts to different sample selection biases using a minimax estimation formulation. Our approach requires only accurate estimates of statistics under the source distribution and is otherwise as robust as possible to unknown properties of the conditional label distribution, except when explicit generalization assumptions are incorporated.
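Sample-reweighted loss minimization, the baseline technique the abstract refers to, can be sketched as an importance-weighted logistic loss: each source example is weighted by the density ratio of target to source. The density values below are assumed given; this is not the authors' minimax estimator:

```python
import numpy as np

def reweighted_logistic_loss(w, X, y, source_density, target_density):
    """Sample-reweighted loss: weight each source example by the density
    ratio p_target(x) / p_source(x) before averaging. Labels y in {-1, +1}."""
    ratios = target_density / source_density       # importance weights
    margins = y * (X @ w)
    losses = np.log1p(np.exp(-margins))            # logistic loss per example
    return np.average(losses, weights=ratios)
```

With finite source data the estimated ratios can be extreme under large selection bias, inflating the variance of this estimator, which is the failure mode the paper's minimax formulation is designed to avoid.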
On the Hardness of Robust Classification
It is becoming increasingly important to understand the vulnerability of machine learning models to adversarial attacks. In this paper we study the feasibility of robust learning from the perspective of computational learning theory, considering both sample and computational complexity. In particular, our definition of robust learnability requires polynomial sample complexity. We start with two negative results. We show that no non-trivial concept class can be robustly learned in the distribution-free setting against an adversary who can perturb just a single input bit.
Robust Classification by Coupling Data Mollification with Label Smoothing
Heinonen, Markus, Tran, Ba-Hien, Kampffmeyer, Michael, Filippone, Maurizio
Introducing training-time augmentations is a key technique to enhance generalization and prepare deep neural networks against test-time corruptions. Inspired by the success of generative diffusion models, we propose a novel approach coupling data augmentation, in the form of image noising and blurring, with label smoothing to align predicted label confidences with image degradation. The method is simple to implement, introduces negligible overheads, and can be combined with existing augmentations. We demonstrate improved robustness and uncertainty quantification on the corrupted image benchmarks of the CIFAR and TinyImageNet datasets.
- Europe > Norway (0.04)
- Europe > France (0.04)
- Europe > Finland (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
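Coupling data mollification with label smoothing, as described in the abstract above, can be sketched as follows: the image is corrupted with Gaussian noise at a chosen level, and the one-hot label is interpolated toward the uniform distribution by the same amount. The linear coupling below is an illustrative assumption, not necessarily the paper's exact schedule:

```python
import numpy as np

def mollify(image, label_onehot, noise_level, num_classes, rng):
    """Noise the image and smooth the label in proportion: heavier
    corruption -> label closer to uniform, aligning predicted
    confidence with image degradation."""
    noisy = image + noise_level * rng.normal(size=image.shape)
    smoothed = (1 - noise_level) * label_onehot + noise_level / num_classes
    return noisy, smoothed
```

At `noise_level=0` the pair is returned unchanged; at `noise_level=1` the target is fully uniform, so the model is never asked to be confident about an image it cannot recognize.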
TriAug: Out-of-Distribution Detection for Robust Classification of Imbalanced Breast Lesion in Ultrasound
Ye, Yinyu, Chen, Shijing, Ni, Dong, Huang, Ruobing
Different diseases, such as histological subtypes of breast lesions, have severely varying incidence rates. Even when trained with a substantial amount of in-distribution (ID) data, models often encounter out-of-distribution (OOD) samples belonging to unseen classes in clinical reality. To address this, we propose a novel framework built upon a long-tailed OOD detection task for breast ultrasound images. It is equipped with a triplet state augmentation (TriAug) that improves ID classification accuracy while maintaining promising OOD detection performance. Meanwhile, we design a balanced sphere loss to handle the class imbalance problem.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- North America > United States (0.04)
- Asia > China > Hong Kong (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Diagnostic Medicine (0.69)
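A common scoring rule for OOD detection tasks like the one above is the negative energy score (log-sum-exp of the logits), thresholded on a validation set. This generic sketch is not the TriAug framework itself:

```python
import numpy as np

def ood_score(logits):
    """Negative free energy: log-sum-exp of logits; higher for ID-like inputs."""
    return np.log(np.sum(np.exp(logits), axis=-1))

def flag_ood(logits, threshold):
    """Flag samples whose score falls below a threshold tuned on validation data."""
    return ood_score(logits) < threshold
```

Confident, peaked logits yield a high score, while the flat logits typical of unseen classes yield a low one, so thresholding the score separates ID from OOD inputs.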